Quality of Service of an Asynchronous Crash-Recovery Leader Election Algorithm
Abstract
In asynchronous distributed systems it is very hard to assess whether one of the processes taking part in a computation is operating correctly or has failed. To overcome this problem, distributed algorithms are built using unreliable failure detectors, which capture in an abstract way the timing assumptions necessary to assess the operating status of a process. One particular type of failure detector is a leader election, which indicates a single process that has not failed. The unreliability of these failure detectors means that they can make mistakes; however, if they are to be used in practice, there must be limits to the eventual behavior of these detectors. These limits are defined as the quality of service (QoS) provided by the detector. Many works have tackled the problem of creating failure detectors with predictable QoS, but only for crash-stop processes and synchronous systems. This paper presents and analyzes the behavior of a new leader election algorithm named NFD-L for the asynchronous crash-recovery failure model that is efficient in terms of its use of stable memory and message exchanges.
Similar resources
Leader Election in Distributed Systems with Crash Failures
Leader election is an important problem in distributed computing. Garcia-Molina's Bully Algorithm is a classic solution to leader election in synchronous systems with crash failures. This paper shows that the Bully Algorithm can be easily adapted for use in asynchronous systems. First, we re-write the Bully Algorithm to use a failure detector, instead of explicit time-outs; this yields a modula...
Designing and Evaluating Fault-tolerant Leader Election Algorithms
Fault-tolerant leader election is a basic building block for dependable distributed computing, since it allows coordinating decisions among replicas such that they remain consistent. Indeed, several fault-tolerant agreement protocols rely on an eventual leader election service. This problem has been initially studied in crash-prone systems, and more recently in other failure scenarios, e.g., cr...
Leader Election in Asynchronous Distributed Systems
In a previous paper, Garcia-Molina specifies the leader election problem for synchronous and asynchronous distributed systems with crash and link failures and gives an elegant algorithm for each type of system. This paper points out a flaw in Garcia-Molina's specification of leader election in asynchronous systems and proposes a new specification.
Optimal Distributed t-Resilient Election in Complete Networks
We study the problem of distributed leader election in an asynchronous complete network, in presence of faults that occurred prior to the execution of the election algorithm. Failures of this type are encountered, for example, during a recovery from a crash in the network. For a network with n processors, k of which start the algorithm and at most t of which might be faulty, we present an algor...
A Leader Election Protocol for Fault Recovery in Asynchronous Fully-Connected Networks
We introduce a new algorithm for consistent failure detection in asynchronous systems. Informally, consistent failure detection requires processes in a distributed system to distinguish between two different populations: a fault free population and a faulty one. The major contribution of this paper is in combining ideas from group membership and leader election, in order to have an election prot...
Journal title:
- CoRR
Volume: abs/1704.06302
Publication date: 2017